Online Pattern Recognition in Multivariate Data Streams using Unsupervised Learning

Authors

  • Devina Desai
  • Tim Oates
Abstract

Extracting patterns from data streams incrementally, using bounded memory and bounded time, is a difficult task. Traditional similarity-search metrics such as Euclidean distance handle differences in amplitude between static time series by normalizing the series before comparison. This technique cannot be applied to data streams, because the entire data set is never available at any given time. In this paper, we propose an algorithm that finds patterns in data streams incrementally by observing only a fixed-size window through which the data stream flows. The algorithm is unsupervised: it randomly samples the data streams and eventually identifies good patterns by comparing samples with one another. It uses local information to find patterns globally over the data stream. The metric used for comparison is the Euclidean distance between the first derivatives of the data points of the candidate patterns. Using such a metric eliminates the problem of different scales in the data streams. The output of the algorithm is a representation of the pattern in the form of normal distributions that characterize the ranges of values that might be observed. We show that our algorithm can be used to extract interesting patterns from both univariate and multivariate data streams. We measure the performance of our algorithm using standard metrics such as precision and recall, defined according to our context.
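
The comparison metric and the output representation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (first_differences, derivative_distance, summarize_pattern), the use of consecutive differences as the discrete first derivative, and the per-position mean and standard deviation are all assumptions made for illustration. The example only demonstrates that differencing cancels a constant amplitude offset between two windows; the abstract does not say how the paper treats other scale differences.

```python
import math
from statistics import mean, pstdev


def first_differences(window):
    """Discrete first derivative (assumed form): consecutive differences."""
    return [b - a for a, b in zip(window, window[1:])]


def derivative_distance(x, y):
    """Euclidean distance between the first differences of two windows."""
    if len(x) != len(y):
        raise ValueError("windows must have the same length")
    dx, dy = first_differences(x), first_differences(y)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(dx, dy)))


def summarize_pattern(samples):
    """Hypothetical summary: one (mean, std) pair per window position,
    i.e. a normal distribution for each position of the pattern."""
    return [(mean(col), pstdev(col)) for col in zip(*samples)]


a = [0.0, 1.0, 3.0, 2.0]
b = [10.0, 11.0, 13.0, 12.0]  # same shape as a, shifted up by 10
c = [0.0, 2.0, 1.0, 4.0]      # different shape

print(derivative_distance(a, b))  # 0.0: the constant offset cancels
print(derivative_distance(a, c))  # positive: the shapes differ
print(summarize_pattern([a, b]))  # per-position (mean, std) pairs
```

Because the distance depends only on differences within each window, it can be computed on a fixed-size window without knowing the global mean or range of the stream, which is the constraint the abstract emphasizes.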

Similar resources

Comparison Between Unsupervised and Supervise Fuzzy Clustering Method in Interactive Mode to Obtain the Best Result for Extract Subtle Patterns from Seismic Facies Maps

Pattern recognition on seismic data is a useful technique for generating seismic facies maps that capture changes in the geological depositional setting. Seismic facies analysis can be performed using the supervised and unsupervised pattern recognition methods. Each of these methods has its own advantages and disadvantages. In this paper, we compared and evaluated the capability of two unsuperv...

Unsupervised identification and recognition of situations for high-dimensional sensori-motor streams

An important question in self-learning robots is how robots can autonomously learn about and act in their environment in an on-line and unsupervised manner. This paper introduces and evaluates Context Recognition in Data Streams (CoRDS), a method that enables a robot to identify and recognise different situations in its environment. CoRDS achieves this by processing the data stream from the rob...

Choosing ‘codebooks’ for self-organising maps: A Case Study

Statistical pattern recognition techniques, supervised and unsupervised classification techniques being two good examples here, rely on the computations of similarity and distance metrics. These metrics are computed for an unknown pattern, say, x, and a reference vector, xk, selected usually by human beings. This means that when the training regimen leads to a finite set of categories it would ...

Unsupervised Learning of Patterns in Data Streams Using Compression and Edit Distance

Many unsupervised learning methods for recognising patterns in data streams are based on fixed length data sequences, which makes them unsuitable for applications where the data sequences are of variable length such as in speech recognition, behaviour recognition and text classification. In order to use these methods on variable length data sequences, a pre-processing step is required to manual...

An unsupervised data projection that preserves the cluster structure

Article history: Received 26 September 2010 Available online 2 November 2011 Communicated by G. Borgefors

Publication date: 2003